Foot Traffic Analysis: Analyse the city's foot traffic at night (the night economy). The purpose is to track the city's nighttime activity in support of the city plan to revive the night economy, ideally at 1-hour intervals.
Date: March-July 2024
Duration: 90 mins
Level: Intermediate
Pre-requisite Skills: Python, basic machine learning; optional Google Colaboratory (Colab) access
Dataset 4: pedestrian-network 85326 records
Dataset Link
Dataset Link
Dataset 5: Business establishments location and industry classification 374210 records
Dataset Link
Dataset Link
Project Objective, Overview & Research¶
Context: To promote the night economy by boosting local businesses and after-hours activity in Melbourne City (7pm-7am). This use case examines nighttime pedestrian movement patterns to support future planning and potential strategies for businesses in the area.¶
Objective: To track the city's nighttime foot traffic in order to identify active zones, time slots, and potential areas for development to stimulate the night economy.¶
- As a city council member, I want to analyze and understand the patterns of nighttime foot traffic in Melbourne so that I can develop effective policies and initiatives to boost the Melbourne night scene and ensure safety and enjoyment for both residents and visitors.
- As a resident in Melbourne, I want to see the city's nighttime foot traffic so that I can identify areas which would suit my lifestyle, ultimately supporting local businesses.
Deliverables:¶
- A detailed report containing the analysis of nighttime foot traffic, including graphical content.
- Key findings and proposed strategies based on data.
Part 1¶
Data Preprocessing:¶
- Combine datasets to identify pedestrian traffic (locations, counts, & network paths).
- Clean (normalise) the data.
Temporal Analysis:¶
- Outline pedestrian counts at hourly intervals during nighttime (6 PM to 6 AM).
- Identify peak activity times.
Part 2¶
Spatial Analysis: Map pedestrian counts onto the pedestrian network to identify high- and low-traffic areas (datasets: Pedestrian Counting System sensor locations, counts per hour, and Pedestrian Network).¶
Trend Analysis: Identify monthly and yearly trends in nighttime foot traffic using the sensor and pedestrian count data. Additionally, find nearby parking locations that could extend foot traffic areas (On Street Parking Bays).¶
Strategic Planning: Identify strategic locations for new businesses to assist in public transportation (Business establishments location and industry classification).¶
Set up¶
In [1]:
# Dependencies
import warnings
warnings.filterwarnings("ignore")
import requests
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)
In [2]:
# Lib
!pip install geopandas
import time
import sys
from io import StringIO
from datetime import datetime, date
import numpy as np
from tqdm.auto import tqdm
import pandas as pd
import geopandas
import json
import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt
from shapely.geometry import Point
import geopandas as gpd
import folium
from folium.plugins import MarkerCluster
from scipy.spatial import cKDTree
from ipywidgets import interact_manual, Dropdown, Button, VBox
In [60]:
!pip install plotly
In [3]:
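import os
from google.colab import drive
drive.mount('/content/drive')
# Read the API key from a private text file on Google Drive
with open('/content/drive/My Drive/SIT378/h.txt', 'r') as file:
    api_key = file.read().strip()
# The key is used as read from the file. Passing it through os.getenv()
# would treat it as an environment-variable name and return None.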
Mounted at /content/drive
In [4]:
# Define the company colors
color_d = ['#08af64', '#14a38e', '#0f9295', '#056b8a', '#121212'] #Dark theme
color_d_extend = ['#00918D', '#007E87', '#008883', '#0C5D70', '#51C293', '#008C9E', '#08af64', '#14a38e', '#0f9295', '#056b8a', '#121212']
color_night = ['#121212']
color_white = ['#B6B8D6']
color_l = ['#2af598', '#22e4ac', '#1bd7bb', '#14c9cb', '#0fbed8', '#08b3e5'] #Light theme
color_median = ['#9f1b04', '#7f10d0', '#cc8400', '#e480a1', '#4a71c7']
Load Dataset¶
- pedestrian-counting-system-monthly-counts-per-hour
In [8]:
# Download datasets
def download_dataset(api_key, dataset_id, base_url='https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/'):
    """Download one dataset as CSV and return it as a DataFrame (None on failure)."""
    export_format = 'csv'  # renamed from `format` to avoid shadowing the built-in
    url = f'{base_url}{dataset_id}/exports/{export_format}'
    params = {
        'select': '*',
        'limit': -1,  # all records
        'lang': 'en',
        'timezone': 'UTC',
        'api_key': api_key
    }
    with requests.get(url, params=params, stream=True) as response:
        if response.status_code == 200:
            # content-length may be absent for streamed exports, in which case the total is 0
            total_size = int(response.headers.get('content-length', 0))
            chunk_size = 1024  # 1KB per chunk
            progress_bar = tqdm(total=total_size, unit='iB', unit_scale=True, desc=f"Downloading {dataset_id}")
            content = bytearray()
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # filter out keep-alive new chunks
                    content.extend(chunk)
                    progress_bar.update(len(chunk))
            progress_bar.close()
            data = pd.read_csv(StringIO(content.decode('utf-8')), delimiter=';')
            return data
        else:
            print(f'Request failed with status code {response.status_code}')
            return None

# Dataset IDs
dataset_ids = [
    'pedestrian-counting-system-monthly-counts-per-hour',
    'on-street-parking-bays',
    'pedestrian-network',
    'pedestrian-counting-system-sensor-locations',
    'business-establishments-with-address-and-industry-classification'
]
# Initialize dictionary to hold the datasets
datasets = {}
# Download each dataset with a progress bar
for dataset_id in dataset_ids:
    datasets[dataset_id] = download_dataset(api_key, dataset_id)
    if datasets[dataset_id] is not None:
        print(f"{dataset_id} downloaded successfully.")
Downloading pedestrian-counting-system-monthly-counts-per-hour: 0.00iB [00:00, ?iB/s]
pedestrian-counting-system-monthly-counts-per-hour downloaded successfully.
Downloading on-street-parking-bays: 0.00iB [00:00, ?iB/s]
on-street-parking-bays downloaded successfully.
Downloading pedestrian-network: 0.00iB [00:00, ?iB/s]
pedestrian-network downloaded successfully.
Downloading pedestrian-counting-system-sensor-locations: 0.00iB [00:00, ?iB/s]
pedestrian-counting-system-sensor-locations downloaded successfully.
Downloading business-establishments-with-address-and-industry-classification: 0.00iB [00:00, ?iB/s]
business-establishments-with-address-and-industry-classification downloaded successfully.
View Datasets¶
In [9]:
print(datasets['pedestrian-counting-system-monthly-counts-per-hour'].head())
sensor_name timestamp locationid direction_1 \
0 RMIT14_T 2023-05-03T09:00:00+00:00 61 294
1 RMIT14_T 2023-05-03T13:00:00+00:00 61 78
2 Lat224_T 2023-05-02T14:00:00+00:00 62 63
3 Lat224_T 2023-05-02T20:00:00+00:00 62 50
4 Lat224_T 2023-05-02T21:00:00+00:00 62 61
direction_2 total_of_directions location
0 379 673 -37.80767455, 144.96309114
1 64 142 -37.80767455, 144.96309114
2 46 109 -37.80996494, 144.96216521
3 27 77 -37.80996494, 144.96216521
4 42 103 -37.80996494, 144.96216521
In [10]:
print(datasets['on-street-parking-bays'].head())
roadsegmentid kerbsideid \
0 20751 8598
1 20751 8604
2 20751 8607
3 22499 8624
4 22499 8625
roadsegmentdescription latitude longitude \
0 Argyle Place South between Lygon Street and Ca... -37.803448 144.965682
1 Argyle Place South between Lygon Street and Ca... -37.803428 144.965496
2 Argyle Place South between Lygon Street and Ca... -37.803413 144.965361
3 Queensberry Street between Lygon Street and Ca... -37.804688 144.965314
4 Queensberry Street between Lygon Street and Ca... -37.804695 144.965379
lastupdated
0 2023-05-28
1 2023-05-28
2 2023-05-28
3 2023-05-28
4 2023-05-28
In [11]:
print(datasets['pedestrian-network'].head())
geo_point_2d \
0 -37.791128630450004, 144.9250621319
1 -37.790885197, 144.92478186034998
2 -37.79063325525, 144.92244485650002
3 -37.790601697949995, 144.9253671047
4 -37.790968201, 144.942393422
geo_shape objectid neworkid
0 {"coordinates": [[144.9251354935, -37.79109699... 5735 NaN
1 {"coordinates": [[144.9247703611, -37.79089170... 5738 NaN
2 {"coordinates": [[144.9224965319, -37.79068385... 5740 NaN
3 {"coordinates": [[144.9254894579, -37.79073841... 5743 NaN
4 {"coordinates": [[144.9424411919, -37.79097389... 5744 NaN
In [12]:
print(datasets['pedestrian-counting-system-sensor-locations'].head())
location_id sensor_description sensor_name \
0 24 Spencer St-Collins St (North) Col620_T
1 25 Melbourne Convention Exhibition Centre MCEC_T
2 36 Queen St (West) Que85_T
3 37 Lygon St (East) Lyg260_T
4 41 Flinders La-Swanston St (West) Swa31
installation_date note location_type status \
0 2013-09-02 NaN Outdoor A
1 2013-08-28 NaN Outdoor A
2 2015-01-20 Pushbox Upgrade, 03/08/2023 Outdoor A
3 2015-02-11 Pushbox Upgrade, 30/06/2023 Outdoor A
4 2017-06-29 NaN Outdoor A
direction_1 direction_2 latitude longitude location
0 East West -37.818880 144.954492 -37.81887963, 144.95449198
1 East West -37.824018 144.956044 -37.82401776, 144.95604426
2 North South -37.816525 144.961211 -37.81652527, 144.96121062
3 North South -37.803103 144.966715 -37.80310271, 144.96671451
4 North South -37.816686 144.966897 -37.81668634, 144.96689733
In [13]:
print(datasets['business-establishments-with-address-and-industry-classification'].head())
census_year block_id property_id base_property_id clue_small_area \
0 2004 44 108110 108110 Melbourne (CBD)
1 2004 44 108110 108110 Melbourne (CBD)
2 2004 44 108111 108111 Melbourne (CBD)
3 2004 44 108111 108111 Melbourne (CBD)
4 2004 44 108111 108111 Melbourne (CBD)
trading_name business_address \
0 Total Tel International Part Level 12, 140-0 Queen Street MELBOURNE 3000
1 CJ Ham & Murray Part Floor 2, 140 Queen Street MELBOURNE 3000
2 Netherlands Consulate Suite 7, Part Level 4, 118-126 Queen Street ME...
3 Thomas Koutsoupias Suite 10, Level 9, 118-126 Queen Street MELBOU...
4 K S T Partners Suite 4, Floor 5, 118 Queen Street MELBOURNE 3000
industry_anzsic4_code industry_anzsic4_description longitude \
0 5809 Other Telecommunications Services 144.961185
1 6720 Real Estate Services 144.961185
2 7552 Foreign Government Representation 144.961308
3 6931 Legal Services 144.961308
4 6932 Accounting Services 144.961308
latitude
0 -37.815391
1 -37.815391
2 -37.815656
3 -37.815656
4 -37.815656
A. Save datasets locally¶
In [14]:
# Save
base_path = '/content/drive/My Drive/sit378_foot_traffic_analysis/'
for dataset_id, df in datasets.items():
    if df is not None:
        filename = f"{base_path}{dataset_id}.csv"
        df.to_csv(filename, index=False)
        print(f"Saved {filename} to Google Drive.")
Saved /content/drive/My Drive/sit378_foot_traffic_analysis/pedestrian-counting-system-monthly-counts-per-hour.csv to Google Drive.
Saved /content/drive/My Drive/sit378_foot_traffic_analysis/on-street-parking-bays.csv to Google Drive.
Saved /content/drive/My Drive/sit378_foot_traffic_analysis/pedestrian-network.csv to Google Drive.
Saved /content/drive/My Drive/sit378_foot_traffic_analysis/pedestrian-counting-system-sensor-locations.csv to Google Drive.
Saved /content/drive/My Drive/sit378_foot_traffic_analysis/business-establishments-with-address-and-industry-classification.csv to Google Drive.
B. Load datasets¶
In [15]:
# Load
base_path = '/content/drive/My Drive/sit378_foot_traffic_analysis/'
datasets = {} # Dictionary
# Filenames
dataset_filenames = {
'pedestrian_hour': 'pedestrian-counting-system-monthly-counts-per-hour.csv',
'on_street_parking': 'on-street-parking-bays.csv',
'pedestrian-network': 'pedestrian-network.csv',
'pedestrian-counting-system-sensor-locations': 'pedestrian-counting-system-sensor-locations.csv',
'business-establishments-with-address-and-industry-classification': 'business-establishments-with-address-and-industry-classification.csv'
}
# Load each dataset into the datasets dictionary
for dataset_id, filename in dataset_filenames.items():
    full_path = f"{base_path}{filename}"
    datasets[dataset_id] = pd.read_csv(full_path)
    print(f"Loaded {dataset_id} with {datasets[dataset_id].shape[0]} records.")
Loaded pedestrian_hour with 549976 records.
Loaded on_street_parking with 19162 records.
Loaded pedestrian-network with 85326 records.
Loaded pedestrian-counting-system-sensor-locations with 140 records.
Loaded business-establishments-with-address-and-industry-classification with 374210 records.
In [20]:
# Dataframes
pedestrian_hour = datasets['pedestrian_hour']
on_street_parking = datasets['on_street_parking']
pedestrian_network = datasets['pedestrian-network']
sensor_locations = datasets['pedestrian-counting-system-sensor-locations']
business_establishments = datasets['business-establishments-with-address-and-industry-classification']
Data Preprocessing:¶
- Combine datasets to identify pedestrian traffic (locations, counts, & network paths).
- Clean (normalise) the data.
In [22]:
# Check missing values for dataset
print('pedestrian_network')
print(pedestrian_network.isnull().sum())
print("#" * 30)
print()
print('on_street_parking')
print(on_street_parking.isnull().sum())
print("#" * 30)
print()
print('pedestrian_hour')
print(pedestrian_hour.isnull().sum())
print("#" * 30)
print()
print('sensor_locations')
print(sensor_locations.isnull().sum())
print("#" * 30)
print('business_establishments')
print(business_establishments.isnull().sum())
print("#" * 30)
pedestrian_network
geo_point_2d 0
geo_shape 0
objectid 0
neworkid 71060
dtype: int64
##############################
on_street_parking
roadsegmentid 0
kerbsideid 14149
roadsegmentdescription 0
latitude 0
longitude 0
lastupdated 0
dtype: int64
##############################
pedestrian_hour
sensor_name 0
timestamp 0
locationid 0
direction_1 0
direction_2 0
total_of_directions 0
location 0
dtype: int64
##############################
sensor_locations
location_id 0
sensor_description 0
sensor_name 0
installation_date 2
note 107
location_type 0
status 0
direction_1 43
direction_2 43
latitude 0
longitude 0
location 0
dtype: int64
##############################
business_establishments
census_year 0
block_id 0
property_id 0
base_property_id 0
clue_small_area 0
trading_name 127
business_address 1
industry_anzsic4_code 0
industry_anzsic4_description 0
longitude 4785
latitude 4785
dtype: int64
##############################
Data Cleaning¶
- Find missing data and decide how to handle it:
- pedestrian_network (85326 rows): neworkid missing in 71060 rows (83.28%), drop the column
- on_street_parking (19162 rows): kerbsideid missing in 14149 rows (73.84%), drop the column
- pedestrian_hour: no NaNs
- sensor_locations (140 rows): note missing in 107 rows (76.43%), drop the column
- sensor_locations (140 rows): direction_1 and direction_2 each missing in 43 rows (30.71%), attempt to clean
- business_establishments (374210 rows): longitude/latitude missing in 4785 rows (1.28%), drop the rows
- business_establishments (374210 rows): trading_name missing in 127 rows (0.03%), drop the rows
- Merge dataframes (pedestrian hour and sensor locations)
- Create a GeoDataFrame for the pedestrian network
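The percentages in the list above can be derived directly from the dataframes rather than by hand. The following is a minimal sketch, assuming a hypothetical helper named `missing_summary` (not part of the original notebook) and a toy frame whose values are illustrative only.

```python
import pandas as pd

def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return missing-value counts and percentages for columns that have gaps."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        'missing': counts,
        'percent': (counts / len(df) * 100).round(2),
    })
    # Keep only columns with at least one missing value, worst first
    return summary[summary['missing'] > 0].sort_values('percent', ascending=False)

# Toy frame standing in for the real datasets (values are illustrative only)
demo = pd.DataFrame({
    'objectid': [1, 2, 3, 4],
    'neworkid': [None, None, None, 7.0],
    'note': [None, 'x', 'y', 'z'],
})
demo_summary = missing_summary(demo)
print(demo_summary)
```

Calling the same helper on `pedestrian_network`, `on_street_parking`, `sensor_locations` and `business_establishments` should give figures comparable to the ones listed, which makes the drop-or-clean decision easier to review.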
In [23]:
# pedestrian_network drop
pedestrian_network = pedestrian_network.drop(columns=['neworkid'])
# on_street_parking drop
on_street_parking = on_street_parking.drop(columns=['kerbsideid'])
# sensor drop
sensor = sensor_locations.drop(columns=['note', 'installation_date'])
In [25]:
# Business establishments drop
business_establishments.dropna(subset=['business_address', 'trading_name', 'longitude', 'latitude'], inplace=True)
In [28]:
# Merge sensor_name
pedestrian_hour_sensor = pd.merge(pedestrian_hour, sensor_locations, on='sensor_name', how='left')
print(pedestrian_hour_sensor.head())
sensor_name timestamp locationid direction_1_x \
0 RMIT14_T 2023-05-03T09:00:00+00:00 61 294
1 RMIT14_T 2023-05-03T13:00:00+00:00 61 78
2 Lat224_T 2023-05-02T14:00:00+00:00 62 63
3 Lat224_T 2023-05-02T20:00:00+00:00 62 50
4 Lat224_T 2023-05-02T21:00:00+00:00 62 61
direction_2_x total_of_directions location_x \
0 379 673 -37.80767455, 144.96309114
1 64 142 -37.80767455, 144.96309114
2 46 109 -37.80996494, 144.96216521
3 27 77 -37.80996494, 144.96216521
4 42 103 -37.80996494, 144.96216521
location_id sensor_description installation_date note location_type \
0 61.0 RMIT Building 14 2019-06-28 NaN Outdoor
1 61.0 RMIT Building 14 2019-06-28 NaN Outdoor
2 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
3 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
4 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
status direction_1_y direction_2_y latitude longitude \
0 A North South -37.807675 144.963091
1 A North South -37.807675 144.963091
2 A East West -37.809965 144.962165
3 A East West -37.809965 144.962165
4 A East West -37.809965 144.962165
location_y
0 -37.80767455, 144.96309114
1 -37.80767455, 144.96309114
2 -37.80996494, 144.96216521
3 -37.80996494, 144.96216521
4 -37.80996494, 144.96216521
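The merged output above carries `_x`/`_y` suffixes because both tables share `direction_1`, `direction_2` and `location`, and the float `location_id` values (61.0, 62.0) suggest that some count rows found no matching sensor. A minimal sketch of one way to avoid both issues follows; the toy frames, the `Unknown_T` sensor and the chosen column subset are assumptions for illustration, not the notebook's data.

```python
import pandas as pd

# Toy stand-ins for pedestrian_hour and sensor_locations (illustrative values)
counts = pd.DataFrame({
    'sensor_name': ['RMIT14_T', 'Lat224_T', 'Unknown_T'],
    'timestamp': ['2023-05-03T09:00:00+00:00'] * 3,
    'total_of_directions': [673, 109, 5],
    'direction_1': [294, 63, 3],
})
sensors = pd.DataFrame({
    'sensor_name': ['RMIT14_T', 'Lat224_T'],
    'sensor_description': ['RMIT Building 14', 'La Trobe St (North)'],
    'direction_1': ['North', 'East'],
    'latitude': [-37.807675, -37.809965],
    'longitude': [144.963091, 144.962165],
})

# Keep only the sensor attributes that the counts table does not already hold
sensor_cols = ['sensor_name', 'sensor_description', 'latitude', 'longitude']
merged = counts.merge(
    sensors[sensor_cols],
    on='sensor_name',
    how='left',
    validate='many_to_one',  # raise if a sensor name appears twice in the lookup
    indicator=True,          # adds a _merge column flagging unmatched rows
)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'sensor_name'].unique()
print(merged.columns.tolist())
print('Sensors without metadata:', list(unmatched))
```

Selecting the lookup columns before merging keeps the count columns unsuffixed, and the `_merge` indicator lists the sensor names that would otherwise silently receive NaN coordinates.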
In [29]:
# Split geo_point_2d ("lat, lon") into numeric latitude and longitude columns
pedestrian_network[['lat', 'lon']] = pedestrian_network['geo_point_2d'].str.split(',', expand=True).astype(float)
# print(pedestrian_network)
In [30]:
# GeoDataFrame
geometry = [Point(lon, lat) for lon, lat in zip(pedestrian_network['lon'], pedestrian_network['lat'])]
gpd_pedestrianpath = gpd.GeoDataFrame(pedestrian_network, geometry=geometry, crs="EPSG:4326")
gpd_pedestrianpath.drop(columns=['lat', 'lon'], inplace=True)
# print(gpd_pedestrianpath)
Overview Pedestrian Paths¶
- Overview of day and night data
- Review specific sensor points (ACMI)
- Review all sensor locations
In [32]:
# Specific location traffic ACMI
pedestrian_hour_sensor['timestamp'] = pd.to_datetime(pedestrian_hour_sensor['timestamp'])
specific_sensor_data_acmi_day_night = pedestrian_hour_sensor[pedestrian_hour_sensor['sensor_name'] == 'ACMI_T']
specific_sensor_data_acmi_day_night = specific_sensor_data_acmi_day_night.groupby('timestamp')['total_of_directions'].sum()
# Plot
plt.figure(figsize=(10, 5))
plt.plot(specific_sensor_data_acmi_day_night.index, specific_sensor_data_acmi_day_night.values, color=color_d[3], alpha=0.7)
plt.axhline(specific_sensor_data_acmi_day_night.mean(), color='purple', linestyle='--', label='Avg ACMI_T Traffic')
plt.title('Traffic Over Time for Sensor ACMI_T')
plt.xlabel('Time')
plt.ylabel('Total Traffic')
plt.grid(True)
plt.legend()  # show the label of the average line
plt.show()
In [33]:
specific_sensor_data_acmi_day_night.describe()
Out[33]:
count 6950.000000
mean 405.964604
std 377.874285
min 1.000000
25% 59.000000
50% 347.500000
75% 636.000000
max 5331.000000
Name: total_of_directions, dtype: float64
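The series above mixes daytime and nighttime hours, and the timestamps were exported with `'timezone': 'UTC'`. A minimal sketch of how the night window could be isolated follows, assuming the 6 PM to 6 AM window from the Temporal Analysis plan and a conversion to `Australia/Melbourne` local time. The `demo` frame and its values are made up for illustration and are not the notebook's data.

```python
import pandas as pd

# Illustrative hourly counts with UTC timestamps (values are made up)
demo = pd.DataFrame({
    'timestamp': pd.date_range('2023-05-02 00:00', periods=48, freq='h', tz='UTC'),
    'total_of_directions': range(48),
})

# Convert to local time so that "night" means night in Melbourne, not in UTC
demo['local_time'] = demo['timestamp'].dt.tz_convert('Australia/Melbourne')
demo['hour'] = demo['local_time'].dt.hour

# Night window assumed here: 18:00 to 05:59 local time (6 PM to 6 AM)
is_night = (demo['hour'] >= 18) | (demo['hour'] < 6)
night = demo[is_night]

# Mean count per local hour, and the busiest night hour
hourly_mean = night.groupby('hour')['total_of_directions'].mean()
peak_hour = hourly_mean.idxmax()
print(hourly_mean)
print('Peak night hour:', peak_hour)
```

Filtering on the UTC hour instead would shift the window by roughly ten or eleven hours, so the conversion step matters more than the exact boundaries chosen for the night window.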
In [35]:
# Group data (Sensor and timestamp + traffic)
grouped_data_day_night = pedestrian_hour_sensor.groupby(['sensor_name', 'timestamp'])['total_of_directions'].sum().unstack(0)
# Multiple sensors
plt.figure(figsize=(12, 7))
for sensor_name in grouped_data_day_night.columns:  # avoids overwriting the `sensor` dataframe
    plt.plot(grouped_data_day_night.index, grouped_data_day_night[sensor_name], label=sensor_name)
plt.title('Traffic Over Time by Sensor')
plt.xlabel('Time')
plt.ylabel('Total Traffic')
plt.legend(title='Sensor Name', loc='upper left', bbox_to_anchor=(1.05, 1))
plt.grid(True)
plt.tight_layout()
plt.show()
In [37]:
grouped_data_day_night.describe()
Out[37]:
| sensor_name | 261Will_T | 280Will_T | 474Fl_T | 488Mac_T | 574Qub_T | ACMI_T | AG_T | AlfPl_T | BirBridge_T | BirFed1120_T | Bou231_T | Bou283_T | Bou292_T | Bou655_T | Bou688_T | Bou892T | BouBri_T | BouHbr2353_T | BouHbr_T | Boyd2837_T | Col12_T | Col15_T | Col254_T | Col620_T | Col623_T | Col700_T | Col892T | ElFi_T | Eli250_T | Eli263_T | Eli380_T | Eli483_T | Eli501_T | Errol20_T | Errol23_T | FLDegC_T | FLDegN_T | FLDegS_T | FedCycle_T | FedPed_T | Fli114C_T | Fli114F_T | FliSS_T | FliS_T | Fra118_T | Grat292_T | Hammer1584_T | HarEsB_T | HarEsP_T | KenMac_T | King2_T | King335_T | Lat224_T | LatWill_T | Lon189_T | Lon364_T | LtB170_T | LtB210_T | Lyg161_T | Lyg260_T | Lyg309_T | MCEC_T | NewQ_T | Pel147_T | PriNW_T | QVMQ_T | QVN_T | Que85_T | RMIT14_T | RMIT_T | Rus180_T | SanBri_T | Signal1647_T | SouthB_T | Spen161_T | Spen201_T | Spen229_T | Spr201_T | SprFli_T | Swa123_T | Swa148_T | Swa295_T | Swa31 | Swa607_T | SwaCs_T | UM1_T | UM2_T | UM3_T | VAC_T | Vic_T | WatCit_T | WebBN_T | WestWP_T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 6687.000000 | 6184.000000 | 2651.000000 | 6682.000000 | 6757.000000 | 6950.000000 | 7445.000000 | 6241.000000 | 3316.000000 | 2575.000000 | 6926.000000 | 7187.000000 | 7033.000000 | 462.000000 | 7484.000000 | 569.000000 | 4606.000000 | 2457.000000 | 7159.000000 | 1546.000000 | 6559.000000 | 7464.000000 | 7462.000000 | 7414.000000 | 6920.000000 | 7014.000000 | 1211.000000 | 7004.000000 | 6939.000000 | 6928.000000 | 5698.000000 | 2736.000000 | 6553.000000 | 7287.000000 | 7269.000000 | 6963.000000 | 6490.000000 | 7502.000000 | 5932.000000 | 5898.000000 | 6269.000000 | 7346.000000 | 7289.000000 | 7513.000000 | 6829.000000 | 435.000000 | 2844.000000 | 6772.000000 | 7124.000000 | 6738.000000 | 2827.000000 | 2860.000000 | 7427.000000 | 7210.000000 | 6855.000000 | 7337.000000 | 6836.000000 | 6730.000000 | 6916.000000 | 5241.000000 | 6854.000000 | 4414.000000 | 7469.000000 | 5842.000000 | 6468.000000 | 6594.000000 | 7436.000000 | 7117.000000 | 7282.000000 | 7511.000000 | 6489.000000 | 7382.000000 | 2349.000000 | 7243.000000 | 2861.000000 | 2815.000000 | 2861.000000 | 7135.000000 | 6086.000000 | 6994.000000 | 7160.000000 | 7515.000000 | 7472.000000 | 6951.000000 | 7235.000000 | 2073.000000 | 4293.000000 | 5007.000000 | 7297.000000 | 7142.000000 | 7024.000000 | 7190.000000 | 6857.000000 |
| mean | 405.031554 | 145.880821 | 197.830630 | 93.821610 | 56.637561 | 405.964604 | 348.342243 | 127.767505 | 39.319662 | 189.899417 | 419.493070 | 621.212745 | 1029.900754 | 216.751082 | 633.533137 | 40.920914 | 454.746852 | 55.475376 | 98.500629 | 142.347995 | 218.294405 | 362.817256 | 558.034575 | 830.636903 | 317.745087 | 466.085258 | 29.752271 | 1182.510137 | 913.928952 | 389.096276 | 1079.430151 | 456.206871 | 253.499161 | 162.296693 | 110.055991 | 165.005170 | 194.966256 | 466.077446 | 70.864295 | 248.417430 | 30.613814 | 230.622380 | 625.592400 | 852.287901 | 131.679748 | 175.501149 | 590.021449 | 63.215741 | 177.753930 | 67.045562 | 158.435798 | 170.997552 | 228.006598 | 206.950485 | 425.607877 | 287.292899 | 326.314950 | 541.796137 | 199.646906 | 109.136424 | 224.931719 | 670.590847 | 222.686705 | 93.192229 | 1165.311843 | 245.925842 | 1099.181818 | 224.142897 | 430.894946 | 724.985621 | 502.158114 | 370.187212 | 97.080885 | 1036.972663 | 493.621111 | 467.793961 | 670.765816 | 216.667134 | 73.726914 | 1488.440520 | 678.308520 | 1199.210379 | 1705.917827 | 186.873687 | 566.151486 | 183.835986 | 146.238994 | 142.372678 | 722.608058 | 136.041865 | 111.940632 | 200.096245 | 25.356278 |
| std | 485.143904 | 119.553327 | 133.239904 | 75.655576 | 46.290549 | 377.874285 | 457.730530 | 129.570908 | 60.352792 | 369.432044 | 317.611887 | 687.371979 | 1070.424036 | 196.006709 | 522.747853 | 34.232105 | 901.958827 | 63.853127 | 108.192349 | 64.483945 | 282.589181 | 391.112846 | 540.030877 | 679.976485 | 301.395793 | 692.355900 | 23.689096 | 897.054660 | 749.464920 | 304.980070 | 786.829308 | 382.157033 | 229.062175 | 135.060428 | 93.371579 | 167.479003 | 172.956293 | 380.753253 | 81.348463 | 532.749304 | 25.876258 | 211.831268 | 473.330632 | 673.099758 | 103.250481 | 253.103878 | 612.136901 | 89.192001 | 180.779591 | 54.838090 | 117.788533 | 123.354520 | 160.970999 | 194.081081 | 319.310837 | 224.216332 | 308.146188 | 480.042581 | 185.554694 | 121.354154 | 201.667438 | 640.030338 | 255.124785 | 91.367629 | 1026.952150 | 253.494147 | 1004.418646 | 189.143552 | 386.177077 | 724.689877 | 401.832270 | 301.606842 | 97.471698 | 842.000843 | 417.407065 | 327.527708 | 451.525393 | 202.739631 | 61.797717 | 1330.923215 | 594.778154 | 971.001614 | 1300.803204 | 172.244378 | 494.609776 | 222.579035 | 189.299122 | 153.457336 | 697.376112 | 121.383867 | 124.540049 | 184.142205 | 24.552942 |
| min | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 1.000000 | 8.000000 | 8.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 1.000000 | 3.000000 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 3.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| 25% | 69.000000 | 30.000000 | 72.000000 | 14.000000 | 10.000000 | 59.000000 | 42.000000 | 23.000000 | 4.000000 | 22.000000 | 108.250000 | 47.000000 | 72.000000 | 47.000000 | 131.000000 | 19.000000 | 30.250000 | 9.000000 | 13.000000 | 109.000000 | 13.000000 | 52.000000 | 50.000000 | 182.000000 | 54.000000 | 31.000000 | 12.000000 | 252.000000 | 168.000000 | 62.000000 | 269.000000 | 122.000000 | 79.000000 | 12.000000 | 13.000000 | 13.000000 | 31.000000 | 62.000000 | 15.000000 | 19.000000 | 10.000000 | 42.000000 | 157.000000 | 188.000000 | 31.000000 | 9.000000 | 65.000000 | 8.000000 | 22.000000 | 11.000000 | 53.000000 | 45.000000 | 87.000000 | 39.000000 | 118.000000 | 60.000000 | 53.000000 | 65.000000 | 24.000000 | 12.000000 | 15.000000 | 89.000000 | 33.000000 | 17.000000 | 165.000000 | 43.000000 | 103.000000 | 48.000000 | 67.000000 | 114.500000 | 101.000000 | 91.000000 | 20.000000 | 223.000000 | 101.000000 | 144.500000 | 183.000000 | 27.000000 | 27.000000 | 178.000000 | 98.000000 | 218.000000 | 336.750000 | 36.000000 | 90.000000 | 23.000000 | 15.000000 | 24.000000 | 63.000000 | 27.000000 | 22.000000 | 21.000000 | 7.000000 |
| 50% | 224.000000 | 133.000000 | 205.000000 | 97.000000 | 54.000000 | 347.500000 | 222.000000 | 86.000000 | 12.000000 | 109.000000 | 395.500000 | 330.000000 | 574.000000 | 169.500000 | 561.000000 | 35.000000 | 172.500000 | 29.000000 | 65.000000 | 149.000000 | 86.000000 | 216.000000 | 387.500000 | 775.000000 | 253.000000 | 136.500000 | 26.000000 | 1194.500000 | 831.000000 | 381.000000 | 1018.500000 | 399.000000 | 199.000000 | 171.000000 | 105.000000 | 98.000000 | 149.500000 | 459.000000 | 50.000000 | 88.000000 | 27.000000 | 212.000000 | 632.000000 | 837.000000 | 124.000000 | 67.000000 | 452.500000 | 42.000000 | 156.000000 | 64.000000 | 154.000000 | 188.000000 | 222.000000 | 161.000000 | 401.000000 | 275.000000 | 257.500000 | 424.500000 | 174.500000 | 73.000000 | 208.000000 | 526.000000 | 182.000000 | 68.000000 | 1016.000000 | 185.500000 | 817.000000 | 194.000000 | 382.000000 | 561.000000 | 445.000000 | 357.000000 | 82.000000 | 951.000000 | 410.000000 | 487.000000 | 767.000000 | 168.000000 | 69.000000 | 1153.000000 | 563.000000 | 991.000000 | 1749.000000 | 148.000000 | 516.000000 | 98.000000 | 77.000000 | 90.000000 | 569.000000 | 120.000000 | 90.000000 | 171.500000 | 17.000000 |
| 75% | 609.000000 | 235.000000 | 296.000000 | 155.000000 | 92.000000 | 636.000000 | 491.000000 | 200.000000 | 54.000000 | 212.500000 | 660.750000 | 1113.000000 | 1976.000000 | 334.000000 | 978.000000 | 52.000000 | 552.750000 | 84.000000 | 150.000000 | 185.000000 | 382.000000 | 572.000000 | 1007.000000 | 1222.750000 | 469.000000 | 678.000000 | 41.000000 | 1915.000000 | 1575.000000 | 660.000000 | 1756.750000 | 654.000000 | 332.000000 | 272.000000 | 184.000000 | 288.000000 | 330.000000 | 791.750000 | 87.000000 | 194.000000 | 43.000000 | 345.000000 | 975.000000 | 1292.000000 | 207.000000 | 198.000000 | 929.750000 | 95.000000 | 273.000000 | 114.000000 | 235.000000 | 266.000000 | 345.000000 | 307.000000 | 671.000000 | 470.000000 | 514.000000 | 963.000000 | 309.000000 | 173.000000 | 387.000000 | 1078.000000 | 322.000000 | 139.000000 | 1872.500000 | 335.000000 | 1996.000000 | 346.000000 | 677.000000 | 1089.000000 | 856.000000 | 546.000000 | 141.000000 | 1616.500000 | 818.000000 | 694.000000 | 1029.000000 | 356.000000 | 105.000000 | 2672.000000 | 1132.250000 | 2089.000000 | 2835.000000 | 283.000000 | 912.000000 | 250.000000 | 189.000000 | 213.000000 | 1171.000000 | 201.000000 | 161.000000 | 338.000000 | 37.000000 |
| max | 2813.000000 | 1023.000000 | 907.000000 | 340.000000 | 241.000000 | 5331.000000 | 5926.000000 | 890.000000 | 608.000000 | 5680.000000 | 1658.000000 | 3791.000000 | 5188.000000 | 1053.000000 | 2997.000000 | 274.000000 | 10387.000000 | 518.000000 | 1898.000000 | 316.000000 | 1676.000000 | 1866.000000 | 2368.000000 | 3518.000000 | 2595.000000 | 4137.000000 | 158.000000 | 4943.000000 | 3047.000000 | 1105.000000 | 3237.000000 | 2217.000000 | 2252.000000 | 915.000000 | 429.000000 | 829.000000 | 772.000000 | 1413.000000 | 548.000000 | 6663.000000 | 290.000000 | 3316.000000 | 3217.000000 | 7367.000000 | 614.000000 | 905.000000 | 6908.000000 | 1909.000000 | 2433.000000 | 255.000000 | 1161.000000 | 1433.000000 | 3692.000000 | 1592.000000 | 2792.000000 | 1061.000000 | 2241.000000 | 2135.000000 | 1158.000000 | 868.000000 | 915.000000 | 4960.000000 | 5782.000000 | 519.000000 | 6271.000000 | 2372.000000 | 4132.000000 | 935.000000 | 3394.000000 | 5269.000000 | 2814.000000 | 3513.000000 | 1173.000000 | 8526.000000 | 3858.000000 | 1981.000000 | 2150.000000 | 1184.000000 | 985.000000 | 5963.000000 | 3878.000000 | 4905.000000 | 5535.000000 | 1030.000000 | 3151.000000 | 1127.000000 | 1721.000000 | 1683.000000 | 4807.000000 | 787.000000 | 3224.000000 | 1307.000000 | 324.000000 |
Filter Data by Night Traffic (7pm-7am)
- Night-time records: 310,256 of 549,976 (56.41% of the dataset).
Temporal Analysis:
- Outline pedestrian counts at hourly intervals during the night (7 PM to 7 AM).
- Identify peak activity times.
- Explore weekday & weekend traffic to understand different patterns.
In [39]:
# Parse the timestamp column as datetimes
pedestrian_hour_sensor['timestamp'] = pd.to_datetime(pedestrian_hour_sensor['timestamp'])
# Extract the hour of day (0-23) from each timestamp
pedestrian_hour_sensor['hour'] = pedestrian_hour_sensor['timestamp'].dt.hour
# Keep night-time records: hours 19-23 and 0-7 (note that <= 7 also keeps the 07:00 hourly bucket)
nighttime_data = pedestrian_hour_sensor[(pedestrian_hour_sensor['hour'] >= 19) | (pedestrian_hour_sensor['hour'] <= 7)]
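The `+00:00` offset on the timestamps in this dataset suggests they are stored in UTC, in which case `dt.hour` returns the UTC hour rather than the Melbourne local hour and the night filter would select a different part of the day. A minimal sketch of converting to local time before filtering, assuming UTC timestamps and using a small made-up frame (`demo`, `local_time`, and `local_hour` are illustrative names, not part of the notebook). The sketch uses `< 7` so that the window stops at 7am.

```python
import pandas as pd

# Hypothetical sample: three UTC timestamps with made-up counts
demo = pd.DataFrame({
    "timestamp": ["2023-05-02 20:00:00+00:00",
                  "2023-05-03 02:00:00+00:00",
                  "2023-05-03 10:00:00+00:00"],
    "total_of_directions": [77, 358, 120],
})

# Parse as timezone-aware UTC, then convert to Melbourne local time
demo["timestamp"] = pd.to_datetime(demo["timestamp"], utc=True)
demo["local_time"] = demo["timestamp"].dt.tz_convert("Australia/Melbourne")
demo["local_hour"] = demo["local_time"].dt.hour

# Night window 7pm-7am in local time: hours 19-23 and 0-6
is_night = (demo["local_hour"] >= 19) | (demo["local_hour"] < 7)
night_demo = demo[is_night]
print(night_demo[["timestamp", "local_time", "local_hour"]])
```

Whether the conversion is needed depends on how the source data encodes time, so it is worth checking a few known busy periods (for example weekday lunchtime) against the counts before relying on either version.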
In [40]:
# Display nighttime data
print(nighttime_data.head())
nighttime_data_len = len(nighttime_data)
print(f'The dataset contains {nighttime_data_len} records.')
sensor_name timestamp locationid direction_1_x \
3 Lat224_T 2023-05-02 20:00:00+00:00 62 50
4 Lat224_T 2023-05-02 21:00:00+00:00 62 61
5 Lat224_T 2023-05-02 23:00:00+00:00 62 106
6 Lat224_T 2023-05-03 02:00:00+00:00 62 211
7 Lat224_T 2023-05-03 03:00:00+00:00 62 191
direction_2_x total_of_directions location_x \
3 27 77 -37.80996494, 144.96216521
4 42 103 -37.80996494, 144.96216521
5 79 185 -37.80996494, 144.96216521
6 147 358 -37.80996494, 144.96216521
7 108 299 -37.80996494, 144.96216521
location_id sensor_description installation_date note location_type \
3 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
4 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
5 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
6 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
7 62.0 La Trobe St (North) 2019-09-25 NaN Outdoor
status direction_1_y direction_2_y latitude longitude \
3 A East West -37.809965 144.962165
4 A East West -37.809965 144.962165
5 A East West -37.809965 144.962165
6 A East West -37.809965 144.962165
7 A East West -37.809965 144.962165
location_y hour
3 -37.80996494, 144.96216521 20
4 -37.80996494, 144.96216521 21
5 -37.80996494, 144.96216521 23
6 -37.80996494, 144.96216521 2
7 -37.80996494, 144.96216521 3
The dataset contains 310256 records.
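The temporal-analysis goals listed earlier (hourly counts, peak times, weekday versus weekend) can be sketched with a group-by on the hour and a day-type flag. The example below is a minimal sketch on made-up records; the frame `night`, the `day_type` column, and all values are illustrative. Treating Friday night as a weekday is a simplification that may not suit a night-economy analysis.

```python
import pandas as pd

# Hypothetical night-time records for one sensor (values are made up)
night = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-03-01 19:00", "2024-03-01 22:00",   # Friday
        "2024-03-02 19:00", "2024-03-02 22:00",   # Saturday
        "2024-03-04 19:00", "2024-03-04 22:00",   # Monday
    ]),
    "total_of_directions": [300, 500, 400, 900, 200, 100],
})

night["hour"] = night["timestamp"].dt.hour
# dayofweek: Monday=0 ... Sunday=6, so 5 and 6 are the weekend
night["day_type"] = night["timestamp"].dt.dayofweek.map(
    lambda d: "weekend" if d >= 5 else "weekday"
)

# Mean count per hour of day, and the hour with the highest mean
hourly_mean = night.groupby("hour")["total_of_directions"].mean()
peak_hour = hourly_mean.idxmax()

# Mean count per hour, split into weekday and weekend columns
profile = night.pivot_table(
    index="hour", columns="day_type",
    values="total_of_directions", aggfunc="mean",
)
print(hourly_mean)
print("Peak hour:", peak_hour)
print(profile)
```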
In [43]:
# Find the locations with the heaviest night-time traffic
# Group by location and sum the counts across all night-time hours
traffic_per_location_hour = nighttime_data.groupby(['location_x'])['total_of_directions'].sum().reset_index()
# Five busiest locations
heaviest_locations_times = traffic_per_location_hour.sort_values(by='total_of_directions', ascending=False).head()
print(heaviest_locations_times)
                    location_x  total_of_directions
57  -37.81668634, 144.96689733              9228807
48   -37.81487988, 144.9660878              7709868
70  -37.81798049, 144.96503383              5817965
40  -37.81349441, 144.96515323              5736479
24  -37.81057846, 144.96444294              5673737
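The grouping above totals traffic per location only. To see which location and hour combinations are busiest, the same pattern can be extended to two keys, and grouping on `sensor_description` instead of the coordinate string gives more readable labels. A minimal sketch on made-up records (the frame `records`, the site names, and the counts are illustrative, not taken from the dataset):

```python
import pandas as pd

# Hypothetical records: two sites observed at two night-time hours
records = pd.DataFrame({
    "sensor_description": ["Site A", "Site A", "Site A", "Site B", "Site B", "Site B"],
    "hour": [19, 19, 23, 19, 23, 23],
    "total_of_directions": [120, 80, 40, 60, 30, 50],
})

# Total traffic for each (location, hour) pair, busiest first
by_location_hour = (
    records.groupby(["sensor_description", "hour"], as_index=False)["total_of_directions"]
    .sum()
    .sort_values("total_of_directions", ascending=False)
)
top = by_location_hour.head(3)
print(top)
```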
In [44]:
# Time series of night-time traffic for the ACMI_T sensor
specific_sensor_data_acmi_night = nighttime_data[nighttime_data['sensor_name'] == 'ACMI_T']
specific_sensor_data_acmi_night = specific_sensor_data_acmi_night.groupby('timestamp')['total_of_directions'].sum()
# Plot the series with its mean as a dashed reference line
plt.figure(figsize=(10, 5))
plt.plot(specific_sensor_data_acmi_night.index, specific_sensor_data_acmi_night.values, color=color_d[3], alpha=0.7)
plt.axhline(specific_sensor_data_acmi_night.mean(), color='purple', linestyle='--', label='Avg ACMI_T Traffic')
plt.title('Traffic Over Time for Sensor ACMI_T')
plt.xlabel('Time')
plt.ylabel('Total Traffic')
plt.legend()  # without this call the label on the average line is never shown
plt.grid(True)
plt.show()
In [45]:
specific_sensor_data_acmi_night.describe()
Out[45]:
count    3731.000000
mean      535.459394
std       338.964905
min         5.000000
25%       279.000000
50%       525.000000
75%       748.500000
max      2543.000000
Name: total_of_directions, dtype: float64
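The summary above shows a maximum (2543) well above the 75th percentile (748.5), which points to occasional spikes in the series. One simple way to list such hours is to flag counts more than a chosen number of standard deviations above the mean. A minimal sketch on made-up counts; the series `counts` and the cut-off of 2 standard deviations are assumptions for illustration, not values from the notebook:

```python
import pandas as pd

# Hypothetical hourly counts with one obvious spike
counts = pd.Series(
    [100, 110, 90, 105, 95, 100, 400, 100],
    index=pd.date_range("2024-03-01 19:00", periods=8, freq=pd.Timedelta(hours=1)),
    name="total_of_directions",
)

# Flag hours more than 2 standard deviations above the mean (assumed cut-off)
threshold = counts.mean() + 2 * counts.std()
spikes = counts[counts > threshold]
print(f"Threshold: {threshold:.1f}")
print(spikes)
```

Pedestrian counts are typically right-skewed, so a percentile-based cut-off may be a reasonable alternative to a mean-and-standard-deviation rule.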
In [46]:
# Pivot to one column per sensor: rows are timestamps, values are total traffic
grouped_data = nighttime_data.groupby(['sensor_name', 'timestamp'])['total_of_directions'].sum().unstack(0)
# Plot one line per sensor
plt.figure(figsize=(12, 7))
for sensor in grouped_data.columns:
    plt.plot(grouped_data.index, grouped_data[sensor], label=sensor)
plt.title('Traffic Over Time by Sensor')
plt.xlabel('Time')
plt.ylabel('Total Traffic')
plt.legend(title='Sensor Name', loc='upper left', bbox_to_anchor=(1.05, 1))
plt.grid(True)
plt.tight_layout()
plt.show()
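With one line per sensor, a chart like the one above can become hard to read when many sensors are present. One option is to plot only the busiest few. A minimal sketch of the selection step on a made-up wide table (timestamps as rows, sensors as columns); `wide`, `top_n`, and the sensor names are illustrative:

```python
import pandas as pd

# Hypothetical wide table in the same shape as a sensor-by-timestamp pivot
wide = pd.DataFrame(
    {
        "Sensor_A": [10, 20, 30],
        "Sensor_B": [100, 200, None],   # NaN where a sensor has no reading
        "Sensor_C": [5, 5, 5],
        "Sensor_D": [50, 60, 70],
    },
    index=pd.to_datetime(["2024-03-01 19:00", "2024-03-01 20:00", "2024-03-01 21:00"]),
)

top_n = 2
# Column totals (NaN values are skipped by sum), largest first
totals = wide.sum().sort_values(ascending=False)
top_sensors = totals.head(top_n).index.tolist()
subset = wide[top_sensors]
print(top_sensors)
```

The resulting `subset` can then be drawn with the same plotting loop, or with `subset.plot(figsize=(12, 7))`.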
In [47]:
# Drop records without coordinates
cleaned_nighttime_data = nighttime_data.dropna(subset=['latitude', 'longitude'])
# Centre the map on the mean sensor position
mean_lat = cleaned_nighttime_data['latitude'].mean()
mean_lon = cleaned_nighttime_data['longitude'].mean()
map_folium = folium.Map(location=[mean_lat, mean_lon], zoom_start=13)
# Build [latitude, longitude, weight] triples, one per hourly record
heat_data = [
    [row['latitude'], row['longitude'], row['total_of_directions']]
    for index, row in cleaned_nighttime_data.iterrows()
]
# Add the heat map layer, save the map to HTML, and display it
HeatMap(heat_data, radius=15, max_zoom=13).add_to(map_folium)
map_folium.save('Nighttime_Pedestrian_Traffic.html')
map_folium
Out[47]:
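The heat map cell above passes one point per hourly record, so each sensor location is repeated many times in `heat_data`. An alternative is to aggregate to one weighted point per location first, which keeps the list small. A minimal sketch of that preparation step on made-up coordinates; the frame `points` is illustrative, and scaling the weights to the 0-1 range is a design choice here, not something the notebook does:

```python
import pandas as pd

# Hypothetical records: two locations plus one row with a missing latitude
points = pd.DataFrame({
    "latitude": [-37.81, -37.81, -37.82, -37.82, None],
    "longitude": [144.96, 144.96, 144.97, 144.97, 144.98],
    "total_of_directions": [100, 300, 50, 50, 999],
})

# Drop rows without coordinates, then total the traffic per location
per_location = (
    points.dropna(subset=["latitude", "longitude"])
    .groupby(["latitude", "longitude"], as_index=False)["total_of_directions"]
    .sum()
)

# Scale weights so the busiest location has weight 1.0
per_location["weight"] = (
    per_location["total_of_directions"] / per_location["total_of_directions"].max()
)

# One [latitude, longitude, weight] triple per location
heat_data = per_location[["latitude", "longitude", "weight"]].values.tolist()
print(heat_data)
```

The resulting list can be passed to `HeatMap(heat_data, radius=15).add_to(map_folium)` in the same way as the per-record list.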